Bootstrap Methods for the Cost-Sensitive Evaluation of Classifiers

Authors

  • Dragos D. Margineantu
  • Thomas G. Dietterich
Abstract

Many machine learning applications require classifiers that minimize an asymmetric cost function rather than the misclassification rate, and several recent papers have addressed this problem. However, these papers have either applied no statistical testing or have applied statistical methods that are not appropriate for the cost-sensitive setting. Without good statistical methods, it is difficult to tell whether these new cost-sensitive methods are better than existing methods that ignore costs, and it is also difficult to tell whether one cost-sensitive method is better than another. To rectify this problem, this paper presents two statistical methods for the cost-sensitive setting. The first constructs a confidence interval for the expected cost of a single classifier. The second constructs a confidence interval for the expected difference in costs of two classifiers. In both cases, the basic idea is to separate the problem of estimating the probabilities of each cell in the confusion matrix (which is independent of the cost matrix) from the problem of computing the expected cost. We show experimentally that these bootstrap tests work better than applying standard z tests based on the normal distribution.
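The two-step idea in the abstract (estimate the confusion-matrix cell probabilities first, then combine them with the cost matrix) can be illustrated with a minimal Python sketch. It assumes a percentile bootstrap that resamples cell counts from a multinomial distribution; the abstract does not spell out the exact procedure, so the published method may differ in details such as the handling of empty cells. The function names, the `cost[predicted, true]` indexing convention, and the default parameters are all hypothetical choices made for this illustration.

```python
import numpy as np


def bootstrap_cost_interval(confusion, cost, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for the expected cost of one classifier.

    confusion[i, j]: number of test examples predicted i with true class j.
    cost[i, j]:      cost of predicting i when the true class is j.
    (Both conventions are assumptions of this sketch.)
    """
    confusion = np.asarray(confusion, dtype=float)
    cost = np.asarray(cost, dtype=float)
    n = int(confusion.sum())
    # Step 1: cell probabilities, estimated without looking at the costs.
    probs = confusion / n
    rng = np.random.default_rng(seed)
    # Draw n_boot simulated test sets of size n from the estimated cells.
    counts = rng.multinomial(n, probs.ravel(), size=n_boot)
    # Step 2: expected cost of each replicate = sum of P(cell) * cost(cell).
    costs = (counts / n) @ cost.ravel()
    lo, hi = np.quantile(costs, [alpha / 2, 1 - alpha / 2])
    return float(np.sum(probs * cost)), float(lo), float(hi)


def bootstrap_cost_difference_interval(y_true, pred_a, pred_b, cost,
                                       n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for cost(A) - cost(B) on the same test set."""
    y_true, pred_a, pred_b = (np.asarray(v) for v in (y_true, pred_a, pred_b))
    cost = np.asarray(cost, dtype=float)
    k = cost.shape[0]
    n = len(y_true)
    # Joint counts over cells (prediction of A, prediction of B, true class).
    joint = np.zeros((k, k, k))
    np.add.at(joint, (pred_a, pred_b, y_true), 1)
    # Cost difference attached to each joint cell: cost[a, t] - cost[b, t].
    delta = cost[:, None, :] - cost[None, :, :]
    rng = np.random.default_rng(seed)
    counts = rng.multinomial(n, (joint / n).ravel(), size=n_boot)
    diffs = (counts / n) @ delta.ravel()
    lo, hi = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return float(np.sum(joint / n * delta)), float(lo), float(hi)


if __name__ == "__main__":
    # Illustrative numbers only: a false negative costs 10x a false positive.
    cost = [[0, 10], [1, 0]]
    print(bootstrap_cost_interval([[40, 5], [10, 45]], cost))
```

Under this sketch, one would treat two classifiers as differing in expected cost when the interval returned by `bootstrap_cost_difference_interval` excludes zero. Because the cell probabilities are estimated once and the cost matrix enters only in the final dot product, the same resampled counts could be reused to evaluate several cost matrices.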


Similar resources

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

With the rise of technology, the possibility of fraud in areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to improve the accuracy of credit card fraud detection. After selecting the best and most effec...


Performance Evaluation of Machine Learning Classifiers in Sentiment Mining

In recent years, machine learning classifiers have been of great value in solving a variety of text classification problems. Sentiment mining is a kind of text classification in which messages are classified according to sentiment orientation, such as positive or negative. This paper extends the idea of evaluating the performance of various classifiers to show their effectiveness in sent...


Bootstrap Methods for the Cost-Sensitive Evaluation of Classifiers

Many machine learning applications require classifiers that minimize an asymmetric cost function rather than the misclassification rate, and several recent papers have addressed this problem. However, these papers have either applied no statistical testing or have applied statistical methods that are not appropriate for the cost-sensitive setting. Without good statistical methods, it is difficult ...


A parametric model for predicting cut point of hydraulic classifiers

A new parametric model was developed for predicting cut point of hydraulic classifiers. The model directly uses operating parameters including pulp flowrate, feed particle size characteristics, pulp solids content, solid density and particles retention time in the classification chamber and also covers uncontrollable errors using calibration constants. The model applicability was first verified...


Bayesian Methods for the Evaluation of Classifiers

This paper presents a Bayesian approach to estimating the risk (or the expected loss) of classifiers, and discusses some experimental results and the issues that have to be considered when assessing the risk of classifiers. The development of the proposed methodology was motivated by the shortcomings observed in employing the bootstrap tests of Margineantu and Dietterich [10] especially when ap...



Publication date: 2000